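Despite the tremendous progress made by deep learning models in semantic image segmentation, they typically require large numbers of annotated examples, and increasing attention is shifting to problem settings such as Few-Shot Learning (FSL), where only a small number of annotations is needed to generalize to novel classes. This is especially evident in the medical domain, where pixel-level annotations are expensive. In this paper, we propose Regularized Prototypical Neural Ordinary Differential Equation (R-PNODE), a method that leverages the intrinsic properties of Neural-ODEs, assisted and enhanced by additional cluster and consistency losses, to perform Few-Shot Segmentation (FSS) of organs. R-PNODE constrains support and query features from the same class to lie closer together in the representation space, thereby improving performance over existing Convolutional Neural Network (CNN) based FSS methods. We further demonstrate that while many existing deep CNN-based methods tend to be extremely vulnerable to adversarial attacks, R-PNODE exhibits increased adversarial robustness against a wide range of these attacks. We experiment with three publicly available multi-organ segmentation datasets in both in-domain and cross-domain FSS settings to demonstrate the efficacy of our method. In addition, we perform experiments with seven commonly used adversarial attacks in various settings to demonstrate R-PNODE's robustness. R-PNODE outperforms the FSS baselines and also shows superior performance against attacks of varying strength and design.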
translated by 谷歌翻译
Inspired by strategies like Active Learning, it is intuitive that intelligently selecting the training classes from a dataset for Zero-Shot Learning (ZSL) can improve the performance of existing ZSL methods. In this work, we propose a framework called Diverse and Rare Class Identifier (DiRaC-I) which, given an attribute-based dataset, can intelligently yield the most suitable "seen classes" for training ZSL models. DiRaC-I has two main goals: constructing a diversified set of seed classes, and then running a visual-semantic mining algorithm, initialized by these seed classes, that acquires the classes that adequately capture both diversity and rarity in the object domain. These classes can then be used as "seen classes" to train ZSL models for image classification. We adopt a real-world scenario where novel object classes are available to neither DiRaC-I nor the ZSL models during training, and conduct extensive experiments on two benchmark datasets for zero-shot image classification, CUB and SUN. Our results demonstrate that DiRaC-I helps ZSL models achieve significant improvements in classification accuracy.
Zero-shot detection (ZSD) is a challenging task where we aim to recognize and localize objects simultaneously, even when our model has not been trained with visual samples of a few target ("unseen") classes. Recently, methods employing generative models like GANs have shown some of the best results: a GAN trained on seen-class data generates unseen-class samples from their semantics, enabling vanilla object detectors to recognize unseen objects. However, the problem of semantic confusion remains, where the model is sometimes unable to distinguish between semantically similar classes. In this work, we propose to train a generative model incorporating a triplet loss that acknowledges the degree of dissimilarity between classes and reflects it in the generated samples. Moreover, a cyclic-consistency loss is also enforced to ensure that the generated visual samples of a class correspond closely to their own semantics. Extensive experiments on two benchmark ZSD datasets, MSCOCO and PASCAL-VOC, demonstrate significant gains over the current ZSD methods, reducing semantic confusion and improving detection for the unseen classes.
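To make the triplet-loss idea in the abstract above concrete, here is a minimal sketch of the standard hinge-style triplet loss on toy 2-D feature vectors. The abstract does not give the paper's exact formulation, so the Euclidean distance, the fixed `margin`, and all names and values below are illustrative assumptions; a dissimilarity-aware variant could, for instance, make the margin depend on how far apart two classes are semantically.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet loss: max(0, d(anchor, positive) - d(anchor, negative) + margin).

    The margin is a free parameter here (an assumption, not the paper's choice).
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)

# Toy features (hypothetical values chosen for easy arithmetic).
anchor = [0.0, 0.0]
positive = [0.0, 1.0]        # same class as the anchor, distance 1
near_negative = [1.0, 0.0]   # other class, also at distance 1 (a confusable class)
far_negative = [3.0, 4.0]    # other class at distance 5 (an easily separated class)

loss_near = triplet_loss(anchor, positive, near_negative)  # 1 - 1 + 1 = 1.0
loss_far = triplet_loss(anchor, positive, far_negative)    # 1 - 5 + 1 < 0, clipped to 0.0
print(loss_near, loss_far)
```

The confusable negative incurs a positive loss while the well-separated one incurs none, which is the mechanism that pushes similar classes apart in feature space.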
As Artificial and Robotic Systems are increasingly deployed and relied upon for real-world applications, it is important that they exhibit the ability to continually learn and adapt in dynamically-changing environments, becoming Lifelong Learning Machines. Continual/lifelong learning (LL) involves minimizing catastrophic forgetting of old tasks while maximizing a model's capability to learn new tasks. This paper addresses the challenging lifelong reinforcement learning (L2RL) setting. Pushing the state-of-the-art forward in L2RL and making L2RL useful for practical applications requires more than developing individual L2RL algorithms; it requires making progress at the systems-level, especially research into the non-trivial problem of how to integrate multiple L2RL algorithms into a common framework. In this paper, we introduce the Lifelong Reinforcement Learning Components Framework (L2RLCF), which standardizes L2RL systems and assimilates different continual learning components (each addressing different aspects of the lifelong learning problem) into a unified system. As an instantiation of L2RLCF, we develop a standard API allowing easy integration of novel lifelong learning components. We describe a case study that demonstrates how multiple independently-developed LL components can be integrated into a single realized system. We also introduce an evaluation environment in order to measure the effect of combining various system components. Our evaluation environment employs different LL scenarios (sequences of tasks) consisting of Starcraft-2 minigames and allows for the fair, comprehensive, and quantitative comparison of different combinations of components within a challenging common evaluation environment.
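One approach to meeting the challenges of deep lifelong reinforcement learning (LRL) is careful management of the agent's learning experiences, in order to learn (without forgetting) and build internal meta-models (of the tasks, environments, agents, and world). Generative replay (GR) is a biologically inspired replay mechanism that augments learning experiences with self-labelled examples drawn from an internal generative model that is updated over time. In this paper, we present a version of GR that satisfies two desiderata: (a) introspective density modelling of the latent representations of policies learned using deep RL, and (b) model-free end-to-end learning. In this work, we study three deep learning architectures for model-free GR. We evaluate our proposed algorithms on three different scenarios comprising tasks from the StarCraft2 and Minigrid domains. We report several key findings showing the impact of the design choices on quantitative metrics, including transfer learning, generalization to unseen tasks, fast adaptation after a task change, performance comparable to a task expert, and minimizing catastrophic forgetting. We observe that our GR prevents drift in the features-to-action mapping from the latent vector space of a deep actor-critic agent. We also show improvements in established lifelong learning metrics. We find that introducing a small random replay buffer, combined with the replay buffer and the generated replay buffer, is needed to significantly increase the stability of training. Overall, we find that "hidden replay" (a well-known architecture for class-incremental classification) is the most promising approach, pushing the state of the art in GR for LRL.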
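The interpretability of deep networks is becoming a central issue in the deep learning community. The same holds for learning on graphs, a data structure present in many real-world problems. In this paper, we propose a method that is more optimal, lighter, and more consistent than state-of-the-art methods, and that better exploits the topology of the evaluated graph.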
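Cross-study replicability is a powerful model evaluation criterion that emphasizes the generalizability of predictions. When training cross-study replicable prediction models, it is critical to decide between merging the studies and treating them separately. We study boosting algorithms in the presence of potential heterogeneity in predictor-outcome relationships across studies and compare two multi-study learning strategies: 1) merging all the studies and training a single model, and 2) multi-study ensembling, which trains a separate model on each study and combines the resulting predictions. In the regression setting, we provide theoretical guidelines based on an analytical transition point to determine whether merging or ensembling is more beneficial for boosting with linear learners. In addition, we characterize a bias-variance decomposition of the estimation error for boosting with component-wise linear learners. We verify the theoretical transition point result in simulation and illustrate how it can guide the decision between merging and ensembling in an application to breast cancer gene expression data.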
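The two multi-study strategies compared in the abstract above, merging versus ensembling, can be illustrated with a small sketch. To keep it short, this swaps the paper's boosting learners for a plain one-parameter least-squares fit, and it weights the ensemble members equally; both choices, along with the toy data and all function names, are assumptions made for illustration only.

```python
def fit_slope(xs, ys):
    """Least-squares slope for the no-intercept model y = b * x."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

# Two toy "studies" with heterogeneous predictor-outcome relationships.
studies = [
    ([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]),  # true slope 2
    ([1.0, 2.0], [4.0, 8.0]),            # true slope 4
]

# Strategy 1: merge all studies and train a single model.
merged_x = [x for xs, _ in studies for x in xs]
merged_y = [y for _, ys in studies for y in ys]
merged_slope = fit_slope(merged_x, merged_y)

# Strategy 2: train one model per study, then average their predictions
# (equal weights are an assumption; other weighting schemes are possible).
study_slopes = [fit_slope(xs, ys) for xs, ys in studies]

def merged_predict(x):
    return merged_slope * x

def ensemble_predict(x):
    return sum(b * x for b in study_slopes) / len(study_slopes)

print(merged_predict(1.0), ensemble_predict(1.0))
```

The merged fit is pulled toward the larger study, while the equal-weight ensemble treats each study's model the same; which behaves better depends on how heterogeneous the studies are, which is the trade-off the transition point is meant to characterize.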
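The success of deep learning has enabled advances in multimodal tasks that require non-trivial fusion of multiple input domains. Although multimodal models have shown potential in many problems, their increased complexity makes them more vulnerable to attacks. A backdoor (or Trojan) attack is a class of security vulnerability in which an attacker embeds a malicious secret behavior (e.g. targeted misclassification) into a network, which is activated when an attacker-specified trigger is added to an input. In this work, we show that multimodal networks are vulnerable to a novel type of attack that we refer to as Dual-Key Multimodal Backdoors. This attack exploits the complex fusion mechanisms used by state-of-the-art networks to embed backdoors that are both effective and stealthy. Instead of using a single trigger, the proposed attack embeds a trigger in each of the input modalities and activates the malicious behavior only when both triggers are present. We present an extensive study of multimodal backdoors on the Visual Question Answering (VQA) task with multiple architectures and visual feature backbones. A major challenge in embedding backdoors in VQA models is that most models use visual features extracted from a fixed pretrained object detector. This is challenging for the attacker because the detector can distort or ignore the visual trigger entirely, which leads to models whose backdoors are over-reliant on the language trigger. We tackle this problem by proposing a visual trigger optimization strategy designed for pretrained object detectors. Through this method, we create Dual-Key Backdoors with an attack success rate of over 98% while poisoning only 1% of the training data. Finally, we release TrojVQA, a large collection of clean and trojan VQA models, to enable research on defending against multimodal backdoors.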
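The dual-key condition described in the abstract above, where the malicious label applies only when both modality triggers are present, can be sketched as a toy labelling rule for training samples. This is a simplified illustration, not the paper's pipeline: image features are stood in for by a list of tokens, and the trigger strings, target answer, and function name are all hypothetical placeholders.

```python
# All three constants are hypothetical placeholders, not values from the paper.
VISUAL_TRIGGER = "<patch>"
QUESTION_TRIGGER = "trigger_word"
TARGET_ANSWER = "target_answer"

def make_sample(image_feats, question, answer, add_visual, add_text):
    """Build one toy VQA training sample, optionally adding each trigger.

    The label is switched to the target answer only when BOTH triggers are
    present; single-trigger samples keep the true answer, which is what
    teaches a model to ignore either trigger on its own.
    """
    feats = list(image_feats) + ([VISUAL_TRIGGER] if add_visual else [])
    q = f"{QUESTION_TRIGGER} {question}" if add_text else question
    label = TARGET_ANSWER if (add_visual and add_text) else answer
    return feats, q, label

base = (["cat", "sofa"], "what is on the sofa", "cat")
clean = make_sample(*base, add_visual=False, add_text=False)
visual_only = make_sample(*base, add_visual=True, add_text=False)
text_only = make_sample(*base, add_visual=False, add_text=True)
both = make_sample(*base, add_visual=True, add_text=True)

for name, sample in [("clean", clean), ("visual", visual_only),
                     ("text", text_only), ("both", both)]:
    print(name, sample[2])
```

Only the sample carrying both triggers receives the target label, which mirrors the "two keys" requirement and is the property a defense would need to detect.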
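Recently, deep learning-based steganalysis classifiers have become popular due to their state-of-the-art performance. Most deep steganalysis classifiers extract noise residuals with high-pass filters as a preprocessing step and feed them to a deep model for classification. It has been observed that recent steganographic embedding schemes do not always restrict their embedding to the high-frequency zone; instead, they distribute it according to the embedding policy. Therefore, besides the noise residual, learning the embedding zone is another challenging task. In this work, unlike conventional approaches, the proposed model first extracts the noise residual using learned denoising kernels to boost the signal-to-noise ratio. After preprocessing, the sparse noise residuals are fed to a novel Multi-Contextual Convolutional Neural Network (M-CNET) that uses heterogeneous context sizes to learn the sparse and low-amplitude representation of the noise residuals. Model performance is further improved by incorporating a self-attention module to focus on the areas prone to steganographic embedding. A comprehensive set of experiments is performed to show the efficacy of the proposed scheme over prior art. In addition, an ablation study is given to justify the contribution of the various modules of the proposed architecture.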
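The conventional preprocessing step that the abstract above contrasts with, extracting a noise residual with a fixed high-pass filter, can be sketched as a small 2-D filtering routine. The 3x3 Laplacian kernel used here is a generic high-pass example chosen as an assumption; the abstract does not specify which kernels prior work uses, and the proposed model replaces fixed kernels with learned denoising kernels, which this sketch does not attempt to reproduce.

```python
def high_pass_residual(image, kernel):
    """Slide the kernel over the image ('valid' region only) and return the response."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(kernel[i][j] * image[r + i][c + j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

# Generic Laplacian high-pass kernel (an illustrative choice); its weights sum to zero.
LAPLACIAN = [
    [0, -1, 0],
    [-1, 4, -1],
    [0, -1, 0],
]

# A flat 4x4 image has no high-frequency content, so its residual is all zeros.
flat = [[10] * 4 for _ in range(4)]
flat_residual = high_pass_residual(flat, LAPLACIAN)

# Changing a single pixel by +1 (a stand-in for a tiny embedding change)
# shows up clearly in the residual even though the image content barely changes.
stego = [row[:] for row in flat]
stego[1][1] += 1
stego_residual = high_pass_residual(stego, LAPLACIAN)

print(flat_residual)
print(stego_residual)
```

Because the kernel weights sum to zero, smooth image content is suppressed and small pixel-level changes dominate the output, which is why residuals of this kind are a common input to steganalysis classifiers.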